1. INTRODUCTION

Cardiac disorders (CDs) are increasingly recognized as the world’s leading cause of death. The
disorders involve the cardiac muscle and the vascular system supplying the brain, heart, and other vital
organs [1, 2]. Identifying subsets of CDs from an electrocardiogram (ECG) signal has been one of the
great advances of modern medicine. However, experts do not always recognize the importance of classifying
diseases based on the ECG signal. Because of the high mortality rate of CDs, early discrimination between
normal and abnormal ECG signals is essential for the patient’s treatment. When experts manually inspect the
different ECG waveforms of many patients under a heavy workload, they may misread the signal, which can
affect the precision of the CD diagnosis. Any irregularity in the cardiac rhythm or damage to the cardiac
muscle can change the signal morphology. Moreover, the normal ECG differs from person to person, and two
distinct diseases can have nearly the same effect on the ECG signal. Therefore, an automatic classification
scheme is needed to address the misinterpretation caused by ECG signal variability.

The classification of ECG signals plays a crucial role in the clinical diagnosis of CDs [3].
Once an abnormality is identified, CDs can be detected and patients can receive better treatment. Currently,
computer-aided diagnosis (CAD) can provide CD classification. An ECG signal classification algorithm can be
developed with various signal processing techniques to improve classification performance, and its output
can give experts substantial input to confirm the diagnosis. Unfortunately,
there are several significant problems in ECG signal classification [4, 5], e.g., the lack of standardization
of ECG features, the variability of patients' ECG waveforms, the non-existence of optimal classification rules,
and the confidential information and different kinds of noise in ECG signals, such as baseline drift and
powerline interference. In addition, labeled data are necessary to enhance the precision of the classification
process. If only a limited number of labeled examples is available, the classification performance may be quite
unsatisfactory [6]. Hence, under such conditions, ECG signal classification of CDs is worth investigating in
order to produce an accurate automatic diagnosis.

Several methods have been proposed to address the ECG signal classification problem with good
results. Machine learning techniques classify a large number of ECG signal classes in an automated
manner [7-11]. Still, the features are typically hand-crafted or extracted heuristically. Hence, conventional
machine learning requires more effort, is time-consuming, and its feature representations are sometimes
unreliable. In recent years, deep learning (DL) has been used to overcome the feature representation problems
of conventional machine learning. DL has emerged as the leading technique that uses supervised or unsupervised
approaches to learn features automatically. DL is closely related to a class of brain development theories in
which feature interactions can be maintained simultaneously within a deeper neural network
architecture [12]. DL has been implemented in biomedical engineering applications, such as the classification
of CDs. Several studies have proposed DL in various architectures, e.g., convolutional neural networks
(CNNs) [13, 14], recurrent neural networks (RNNs) [15, 16], deep belief networks (DBNs) [17], and
autoencoders [18].

Deep learning has outperformed conventional classifier algorithms in classification tasks [19-21].
Nonetheless, none of the aforementioned DL architectures is fully appropriate for clinical problems.
According to the literature, published work on ECG-rhythm-based classification is limited because
determining the right time-window for ECG-rhythm classification is not straightforward [22]. The
optimum window size depends on the task: if it is too small, the network will ignore important information;
if it is too large, the network will overfit the training data [23, 24]. For ECG classification, the two most
prominent DL architectures are RNNs and CNNs, but CNNs applied to ECG cut the signal into windows of a fixed
length, which eventually reduces the classification performance [25]. RNNs can be improved in this respect,
as their performance can be optimized by providing the classifier with crafted features [25]. RNNs use internal
memory to process and identify arbitrary input sequences, and the connections between their units form a
directed cycle [25]. Lui et al. [15] found that adding a recurrent layer improved the sensitivity of ECG-based
CD classification by 28% compared with CNNs alone. This gain in sensitivity indicates the effectiveness of
RNNs for CD classification. RNNs can be used for sequential prediction to model the flow of time directly.
RNNs and their variants (long short-term memory (LSTM) and gated recurrent unit (GRU)) can be implemented in
a unidirectional or a bidirectional phase. A standard RNN is unidirectional: the input is interpreted from
left to right without access to future inputs, i.e., the information flows in the forward direction only.
Schuster suggested the bidirectional RNN phase, which uses both past and future inputs for prediction [26].
The unidirectional phase is limited because it is difficult to obtain future input information from the
current state. In contrast, the bidirectional phase does not require its input data to be fixed, and future
input data are accessible from the current state [27]. Yildirim designed both unidirectional and bidirectional
phases for multiclass ECG beat classification [28]. The unidirectional phase achieved a 73.10% success rate,
whereas the bidirectional phase performed considerably better with a 79.53% success rate. However, in some
cases unidirectional recurrent networks still outperform bidirectional ones: by windowing, looking ahead, or
delaying the output, they can still access future inputs, at the cost of a large increase in the number of
parameters [29]. In addition, in terms of time efficiency, the bidirectional phase requires more learning time
than the unidirectional phase [28].

This paper aims to explore the DL technique with recurrent network classifiers for multiclass
ECG-rhythm-based classification. The experiments concentrate on comparing the performance of unidirectional
and bidirectional recurrent networks. The comparison is needed to determine the optimum phase for ECG
multiclass performance (accuracy, sensitivity, specificity, precision, and F1-score) using a publicly
available dataset from PhysioNet. The process consists of two steps. First, the time-window was
determined for learning in each cell state of the recurrent network. Second, instead of performing binary
classification, a multiclass classifier was trained with the healthy control, myocardial infarction,
cardiomyopathy, bundle branch block, and dysrhythmia classes.


2. RESEARCH METHOD 

This paper presents multiclass ECG classification of healthy control, myocardial infarction,
cardiomyopathy, bundle branch block, and dysrhythmia using a public dataset, the PTB
diagnostic ECG database. The method consists of the following main steps: 1) determining the time-window,
2) comparing unidirectional and bidirectional learning algorithms, and 3) proposing the best model for the
application. All the phases of the proposed method are presented in Figure 1.

Figure 1. ECG multiclass classification workflow


2.1. ECG raw data 

The ECG data used in this study were collected from PhysioNet: the PTB diagnostic ECG
database [30]. The algorithm is tested on this database because it provides a total of 549 records: 80 records
of healthy controls and the remaining records of nine cardiac disorders. The classes used in this study and
their record counts are listed in Table 1. Each record includes the conventional 12 leads and the 3 Frank
leads; all 15 leads are used in this study. This study aims to classify healthy control patients and four CDs,
i.e., 1) myocardial infarction, 2) cardiomyopathy, 3) bundle branch block, and 4) dysrhythmia. The other
cardiac disorders in this database were discarded because they belong to classes not considered in the study.
Based on our previous work on binary classification [20], a fixed window size of 4 seconds was used for each
sequence in ECG pre-processing. The total sequence data for the five classes amounted to 13,610 sequences.
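
The fixed 4-second windowing can be illustrated with a minimal sketch. The sampling rate of 1000 Hz, the non-overlapping windows, the dropped incomplete tail, and the helper name `split_into_windows` are assumptions made for illustration; the exact segmentation procedure is not detailed in this section.

```python
import numpy as np

def split_into_windows(signal, sampling_rate_hz, window_seconds=4.0):
    """Cut a (samples, leads) recording into non-overlapping fixed-length windows."""
    signal = np.asarray(signal)
    window_len = int(round(sampling_rate_hz * window_seconds))
    n_windows = signal.shape[0] // window_len  # the incomplete tail is dropped (assumption)
    trimmed = signal[: n_windows * window_len]
    return trimmed.reshape(n_windows, window_len, signal.shape[1])

# Synthetic stand-in for one 15-lead record: 10 s at an assumed 1000 Hz.
record = np.random.default_rng(0).normal(size=(10_000, 15))
windows = split_into_windows(record, sampling_rate_hz=1000)
print(windows.shape)  # (2, 4000, 15)
```

Each window then forms one input sequence of shape (time steps, leads) for the recurrent classifier.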


Table 1. The PTB Diagnostic ECG database description
Cardiac Disorders (CDs)    Records
Healthy Control            80
Myocardial Infarction      368
Cardiomyopathy             17
Bundle Branch Block        17
Dysrhythmia                16
Total                      498


2.2. Recurrent network classifiers 

A sequence model operates on sequences of ordered elements, recorded with or without a concrete
notion of time. The recurrent process in the neural network operates on such sequences: the recurrent
network takes each element of a sequence, multiplies it by a weight matrix, and adds the result to the
network's previous output. There are two directions for the learning phase in recurrent networks:
unidirectional and bidirectional [31]. The unidirectional phase preserves the information of the past and
runs the inputs only in the forward (left-to-right) pass. The bidirectional phase runs the inputs in both the
forward (left-to-right) and backward (right-to-left) passes and preserves information from both past and
future, as presented in Figure 2.
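
The recurrence described above can be sketched in a few lines. The tanh non-linearity, the weight names, and the dimensions are assumptions for illustration only; the sketch simply shows that a forward-only pass never uses future samples.

```python
import numpy as np

def rnn_forward(inputs, w_x, w_h, b):
    """Run a simple recurrence left-to-right over a (T, features) sequence."""
    hidden = np.zeros(w_h.shape[0])  # initial state is a zero vector
    states = []
    for x_t in inputs:
        # Multiply the current element by a matrix and add the previous output.
        hidden = np.tanh(w_x @ x_t + w_h @ hidden + b)
        states.append(hidden)
    return np.stack(states)

rng = np.random.default_rng(0)
T, n_in, n_hidden = 6, 3, 4
w_x = rng.normal(size=(n_hidden, n_in))
w_h = rng.normal(size=(n_hidden, n_hidden))
b = np.zeros(n_hidden)

sequence = rng.normal(size=(T, n_in))
states = rnn_forward(sequence, w_x, w_h, b)

# Changing the last input leaves all earlier states untouched:
# a unidirectional pass only carries information from the past.
altered = sequence.copy()
altered[-1] += 1.0
altered_states = rnn_forward(altered, w_x, w_h, b)
print(np.allclose(states[:-1], altered_states[:-1]))  # True
```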

Recurrent neural networks (RNNs) capture relationships among sequential data types. The feedback loops
at the hidden layers of standard RNNs are unidirectional, i.e., the input is processed from left to right and
the information flows only in the forward direction [29]. A unidirectional model can still access future
inputs by windowing, looking ahead, or delaying the output, with a reasonable increase in the number of
parameters. Bengio et al. [32] showed that capturing long-term dependencies with a simple RNN is difficult
because gradients tend to either vanish or explode over long sequences. Two techniques have been proposed to
mitigate these gradient problems: long short-term memory (LSTM) and gated recurrent unit (GRU).
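
The vanishing-gradient effect can be made concrete with a toy calculation. The scalar recurrence h_t = tanh(w * h_{t-1}), the weight value, and the function name below are illustrative assumptions, not part of the proposed method. By the chain rule, the gradient of the last state with respect to the first is a product of per-step factors, each bounded by |w| in magnitude, so it shrinks geometrically when |w| < 1.

```python
import math

def gradient_through_time(w, steps, h0=0.5):
    """Return |d h_T / d h_0| for the scalar recurrence h_t = tanh(w * h_{t-1})."""
    h, grad = h0, 1.0
    for _ in range(steps):
        h = math.tanh(w * h)
        grad *= w * (1.0 - h * h)  # derivative of tanh(w * h_prev) w.r.t. h_prev
    return abs(grad)

for steps in (1, 10, 50):
    print(steps, gradient_through_time(w=0.9, steps=steps))
```

The printed magnitudes decrease with the number of steps and stay below 0.9 raised to that number of steps. Without the saturating non-linearity and with |w| > 1, the same product would instead grow geometrically, which corresponds to the exploding case.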

Figure 2. The forward and backward passes of unidirectional and bidirectional phases (panel label: Bidirectional Phase, left-to-right and right-to-left)


Schuster presented a new concept of sequence learning in which information flows through both forward
and backward feedback connections [26]. The connections in the forward direction help the network learn from
previous representations, and those going backward help it learn from future representations. Networks with
both connections are called bidirectional RNNs (BiRNNs), which enable the network to predict outputs using
the inputs of the entire sequence. BiRNNs can be trained using all available input data within a specific
timeframe in the past and future [33]. BiRNNs are trained with the same algorithm as regular unidirectional
RNNs because there are no interactions between the two types of state neurons.

Graves et al. proposed an RNN that uses LSTM cells and computes both forward and backward
hidden sequences [23], called the bidirectional LSTM (BiLSTM). BiLSTM is the LSTM version of the
BiRNN architecture and can extend LSTM performance in classification procedures [28]. In contrast to the
regular LSTM structure, two separate LSTM networks are trained on the sequential inputs in the BiLSTM
architecture [28]. The neurons in the forward state of a BiLSTM act as a unidirectional LSTM structure, but
bidirectional networks are reported to be much more effective than unidirectional networks [23]. The current
hidden state depends on two hidden states: those of the forward and the backward LSTM passes. The BiLSTM
equations for the forward and backward passes are given below [31]:


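$$\overrightarrow{\mathrm{LSTM}}_t = \tanh\left(\overrightarrow{W}_x x_t + \overrightarrow{W}_h \overrightarrow{\mathrm{LSTM}}_{t-1} + \overrightarrow{b}\right) \quad (1)$$
$$\overleftarrow{\mathrm{LSTM}}_t = \tanh\left(\overleftarrow{W}_x x_t + \overleftarrow{W}_h \overleftarrow{\mathrm{LSTM}}_{t+1} + \overleftarrow{b}\right) \quad (2)$$

From (1) and (2), the output of the BiLSTM layer at time t is:

$$y_t = \tanh\left(\overrightarrow{W}_y \overrightarrow{\mathrm{LSTM}}_t + \overleftarrow{W}_y \overleftarrow{\mathrm{LSTM}}_t + b_y\right) \quad (3)$$

where the output depends on $\overrightarrow{\mathrm{LSTM}}_t$ and $\overleftarrow{\mathrm{LSTM}}_t$, and $h_0$ is initialized as a zero vector.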
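
A minimal sketch of the forward pass, the backward pass, and the combined output in (1)-(3) is given below. It follows the plain tanh form of the printed recurrences rather than a full gated LSTM cell, and all weight names, scales, and dimensions are hypothetical.

```python
import numpy as np

def recurrent_pass(inputs, w_x, w_h, b):
    """Tanh recurrence over a (T, features) sequence, starting from a zero state."""
    state = np.zeros(w_h.shape[0])
    states = []
    for x_t in inputs:
        state = np.tanh(w_x @ x_t + w_h @ state + b)
        states.append(state)
    return np.stack(states)

def bidirectional_output(inputs, fwd, bwd, w_y_fwd, w_y_bwd, b_y):
    forward = recurrent_pass(inputs, *fwd)               # state t built from x_1 .. x_t
    backward = recurrent_pass(inputs[::-1], *bwd)[::-1]  # state t built from x_T .. x_t
    # Combine both directions at every time step, as in (3).
    return np.tanh(forward @ w_y_fwd.T + backward @ w_y_bwd.T + b_y)

rng = np.random.default_rng(1)
T, n_in, n_hidden, n_out = 5, 3, 4, 2

def random_direction():
    """Hypothetical small random weights for one direction."""
    return (0.5 * rng.normal(size=(n_hidden, n_in)),
            0.5 * rng.normal(size=(n_hidden, n_hidden)),
            np.zeros(n_hidden))

fwd, bwd = random_direction(), random_direction()
w_y_fwd = 0.5 * rng.normal(size=(n_out, n_hidden))
w_y_bwd = 0.5 * rng.normal(size=(n_out, n_hidden))
b_y = np.zeros(n_out)

sequence = rng.normal(size=(T, n_in))
outputs = bidirectional_output(sequence, fwd, bwd, w_y_fwd, w_y_bwd, b_y)
print(outputs.shape)  # (5, 2)
```

Because the backward state at the first time step has already processed the whole sequence in reverse, the output at every position can draw on both past and future samples.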


Cho et al. introduced the GRU to allow each recurrent unit to adaptively capture dependencies at
different time scales [34]. Similar to the BiLSTM, the bidirectional GRU (BiGRU) contains a forward GRU,
which reads the signal from $w_1$ to $w_T$, and a backward GRU, which reads it from $w_T$ to $w_1$:

$$\overrightarrow{s}_t = \overrightarrow{\mathrm{GRU}}(w_t), \quad t \in [1, T] \quad (4)$$
$$\overleftarrow{s}_t = \overleftarrow{\mathrm{GRU}}(w_t), \quad t \in [T, 1] \quad (5)$$


Although the GRU is still relatively new and is less commonly used than the LSTM, some previous studies
have explored GRUs for the ECG classification task [35]. The BiGRU phase is simpler than the BiLSTM
because it has fewer gates: the forget and input gates are combined into a single update gate. With the
update ($z_t$) and reset ($r_t$) gates of the GRU in the forward pass, the hidden states are computed
as follows:


$$h_t = \tanh\left(U_h x_t + W_h (s_{t-1} \circ r_t) + b_h\right) \quad (6)$$
$$s_t = (1 - z_t) \circ h_t + z_t \circ s_{t-1} \quad (7)$$


where W and U are the additional parameters. The symbol $\circ$ denotes the Hadamard product.
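
One GRU step following (6) and (7) can be sketched as below. The gate equations are not reproduced in this section, so the sigmoid form used for the update and reset gates is an assumption (the common textbook form), and the parameter names and sizes are hypothetical.

```python
import numpy as np

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

def gru_step(x_t, s_prev, p):
    """One forward GRU step; p is a dict of weight matrices and biases."""
    z = sigmoid(p["U_z"] @ x_t + p["W_z"] @ s_prev + p["b_z"])        # update gate (assumed form)
    r = sigmoid(p["U_r"] @ x_t + p["W_r"] @ s_prev + p["b_r"])        # reset gate (assumed form)
    h = np.tanh(p["U_h"] @ x_t + p["W_h"] @ (s_prev * r) + p["b_h"])  # candidate state, as in (6)
    return (1.0 - z) * h + z * s_prev                                  # new hidden state, as in (7)

rng = np.random.default_rng(2)
n_in, n_hidden = 3, 4
params = {}
for gate in ("z", "r", "h"):
    params[f"U_{gate}"] = 0.5 * rng.normal(size=(n_hidden, n_in))
    params[f"W_{gate}"] = 0.5 * rng.normal(size=(n_hidden, n_hidden))
    params[f"b_{gate}"] = np.zeros(n_hidden)

state = np.zeros(n_hidden)
for x_t in rng.normal(size=(6, n_in)):
    state = gru_step(x_t, state, params)
print(state.shape)  # (4,)
```

When the update gate saturates near one, (7) copies the previous state almost unchanged, which is how the unit can carry information across many time steps.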


3. RESULTS AND DISCUSSION 

This study uses one-per-class coding, which extends the binary case to multiple classes. The total
number of classes was five: healthy control (HC), myocardial infarction (MI), cardiomyopathy (C), bundle
branch block (BBB), and dysrhythmia (D). For this five-class case, the output codes for HC, MI, C, BBB, and D
were 10000, 01000, 00100, 00010, and 00001, respectively. This technique avoids a problem encountered with
label encoding when working with categorical data, namely that integer labels impose an artificial order on
the classes. Each class's performance is assessed with five metrics: accuracy, sensitivity, specificity,
precision, and F1-score. As in the previous study [20], the data for the unidirectional and bidirectional
multiclass classification were split into 90% for training and 10% for testing. The experiments were run on a
GeForce RTX 2080 graphics processing unit (GPU) under Windows 10 64-bit. The hyperparameters were a batch
size of 512 samples, a softmax function in the output layer, and 100 training epochs.
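
The one-per-class coding above can be reproduced with a short sketch. The helper name `one_hot` and the example labels are illustrative; only the class order HC, MI, C, BBB, D is taken from the text.

```python
import numpy as np

CLASSES = ["HC", "MI", "C", "BBB", "D"]

def one_hot(labels, classes=CLASSES):
    """Map class names to one-per-class codes, e.g. HC -> 1 0 0 0 0."""
    index = {name: i for i, name in enumerate(classes)}
    codes = np.zeros((len(labels), len(classes)), dtype=int)
    codes[np.arange(len(labels)), [index[label] for label in labels]] = 1
    return codes

codes = one_hot(["HC", "BBB", "D"])
print(codes)
# [[1 0 0 0 0]
#  [0 0 0 1 0]
#  [0 0 0 0 1]]

# A softmax output vector is decoded by taking the position of its largest entry.
predicted = CLASSES[int(np.argmax([0.05, 0.80, 0.05, 0.05, 0.05]))]
print(predicted)  # MI
```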

For the unidirectional multiclass classification process, the MI class achieves higher sensitivity,
precision, and F1-score than the other classes with unidirectional RNNs. The results are presented
in Table 2. This might be because far fewer HC, C, BBB, and D data were available than MI data. The
imbalanced distribution causes the majority class to achieve higher performance. Over the five classes,
unidirectional RNNs have an average sensitivity, precision, and F1-score of 90.25%, 85.00%, and 87.27%,
respectively. To address the gradient problems of the standard RNN architecture, fine-tuned unidirectional
LSTM and GRU architectures were evaluated. With LSTM, the average sensitivity, precision, and F1-score
increased to 94.17%, 89.58%, and 91.72%, respectively. With GRU, the performance increased further, to
averages of 95.54%, 89.93%, and 92.31%, respectively.
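
For reference, per-class metrics of this kind are commonly derived from a multiclass confusion matrix in a one-vs-rest fashion and then averaged over the classes. The sketch below assumes that convention, which is not spelled out in the text, and uses a made-up three-class confusion matrix rather than the study's data.

```python
import numpy as np

def per_class_metrics(confusion):
    """confusion[i, j] = number of class-i samples predicted as class j."""
    confusion = np.asarray(confusion, dtype=float)
    total = confusion.sum()
    tp = np.diag(confusion)
    fn = confusion.sum(axis=1) - tp  # missed samples of each class
    fp = confusion.sum(axis=0) - tp  # samples wrongly assigned to each class
    tn = total - tp - fn - fp
    sensitivity = tp / (tp + fn)
    precision = tp / (tp + fp)
    return {
        "accuracy": (tp + tn) / total,
        "sensitivity": sensitivity,
        "specificity": tn / (tn + fp),
        "precision": precision,
        "f1": 2 * precision * sensitivity / (precision + sensitivity),
    }

# Made-up confusion matrix for three classes (rows: true, columns: predicted).
toy = [[8, 2, 0],
       [1, 18, 1],
       [0, 1, 9]]
metrics = per_class_metrics(toy)
print(metrics["sensitivity"])  # [0.8 0.9 0.9]
macro = {name: values.mean() for name, values in metrics.items()}
```

Under this convention a minority class with few test samples can lose several points of precision from a handful of false positives, which is consistent with the lower scores observed for the smaller classes.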


Table 2. Unidirectional RNNs, LSTM, and GRU in the testing set
                     Cardiac Disorders Class (%)
Model  Metrics       HC      MI      C       BBB     D       Mean Value (%)
RNNs   Accuracy      96.77   95.48   99.14   98.92   98.78   97.81
       Sensitivity   89.34   96.30   94.23   85.71   85.71   90.25
       Specificity   97.99   92.94   99.33   99.55   99.05   97.77
       Precision     88.00   97.69   84.48   90.00   64.86   85.00
       F1-Score      88.66   96.99   89.09   87.80   73.85   87.27
LSTM   Accuracy      97.20   96.63   99.28   99.64   99.21   98.39
       Sensitivity   92.15   96.88   96.15   95.08   90.62   94.17
       Specificity   98.00   95.83   99.40   99.85   99.41   98.49
       Precision     88.00   98.65   86.21   96.67   78.38   89.58
       F1-Score      90.03   97.76   90.91   95.87   84.06   91.72
GRU    Accuracy      97.49   96.77   99.50   99.50   99.28   98.50
       Sensitivity   90.64   97.42   94.74   92.92   100.0   95.54
       Specificity   98.66   94.80   99.70   99.70   99.27   98.42
       Precision     92.00   98.27   93.10   93.33   72.97   89.93
       F1-Score      91.32   97.84   93.91   94.12   84.37   92.31





In the bidirectional case, LSTM and GRU still outperformed RNNs and achieved outstanding results,
as presented in Table 3, even though the differences were not significant. The bidirectional LSTM obtained
an average sensitivity, precision, and F1-score of 94.16%, 87.21%, and 90.33%, respectively, and the
bidirectional GRU obtained 93.71%, 88.78%, and 90.96%, respectively. First, for RNNs, the average accuracy
and specificity over the five CD classes were higher in the unidirectional sequence model than in the
bidirectional one, as presented in Tables 2 and 3: 97.81% and 97.77% for the unidirectional RNNs versus
97.33% and 96.66% for the bidirectional RNNs. In contrast, for the average sensitivity, precision, and
F1-score, the bidirectional RNNs performed better than the unidirectional RNNs, with 90.26%, 88.21%, and
89.07%, respectively. Second, unlike RNNs, the unidirectional LSTM achieved better overall performance than
its bidirectional counterpart, with an average accuracy, sensitivity, specificity, precision, and F1-score of
98.39%, 94.17%, 98.49%, 89.58%, and 91.72%, respectively. Finally, as with LSTM, the unidirectional GRU
performs better than the bidirectional GRU.


Table 3. Bidirectional RNNs, LSTM, and GRU in the testing set
                     Cardiac Disorders Class (%)
Model  Metrics       HC      MI      C       BBB     D       Mean Value (%)
RNNs   Accuracy      94.84   94.19   99      99.28   99.35   97.33
       Sensitivity   78.32   96.87   89.29   93.10   93.75   90.26
       Specificity   98.03   86.83   99.40   99.55   99.49   96.66
       Precision     88.50   95.28   86.21   90.00   81.08   88.21
       F1-Score      83.10   96.07   87.72   91.53   86.96   89.07
LSTM   Accuracy      97.20   96.05   99.28   99.28   99.14   98.19
       Sensitivity   91.71   96.33   98.00   91.67   93.10   94.16
       Specificity   98.08   95.18   99.33   99.63   99.27   98.29
       Precision     88.50   98.46   84.48   91.67   72.97   87.21
       F1-Score      90.08   97.38   90.74   91.67   81.82   90.33
GRU    Accuracy      97.70   96.27   99.21   99.28   99.21   98.33
       Sensitivity   93.30   96.78   96.08   89.06   93.33   93.71
       Specificity   98.42   94.69   99.33   99.77   99.34   98.31
       Precision     90.50   98.27   84.48   95.00   75.68   88.78
       F1-Score      91.88   97.52   89.91   91.94   83.58   90.96





In the previous work [20] on MI classification in the binary case, unidirectional LSTM performed better
than GRU. In contrast, in the multiclass case studied here, both unidirectional and bidirectional GRU were
better than LSTM overall. GRU is computationally more efficient than LSTM because it has only two gates.
The training process of GRU was faster than that of LSTM and required less computational power. Notably, in
this study, unidirectional LSTM and GRU did not differ in training time per epoch. However, bidirectional
LSTM and GRU took 5 and 4 seconds per epoch, respectively. Some deep learning techniques have been
implemented in previous work; however, they are still limited to specific ECG leads with fewer classes and
apply only to the common unidirectional model. Based on the comparison between unidirectional and
bidirectional GRU, this study proposes unidirectional GRU as the best model. The unidirectional GRU obtained
an average accuracy, sensitivity, specificity, precision, and F1-score of 98.50%, 95.54%, 98.42%, 89.93%, and
92.31%, respectively, for 15-lead ECG.
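
The efficiency argument can be made concrete by counting trainable weights. The sketch below uses the textbook parameter count per recurrent layer; the hidden size of 64 is a hypothetical value chosen for illustration, and some library implementations keep extra bias terms, so exact counts may differ in practice.

```python
def recurrent_params(n_inputs, n_hidden, n_blocks, bidirectional=False):
    """Textbook weight count: each block has input weights, recurrent weights, and a bias."""
    per_block = n_hidden * (n_inputs + n_hidden) + n_hidden
    total = n_blocks * per_block
    return 2 * total if bidirectional else total

LSTM_BLOCKS = 4  # input, forget, and output gates plus the cell candidate
GRU_BLOCKS = 3   # update and reset gates plus the candidate state

n_leads, n_hidden = 15, 64  # 15 ECG leads; hidden size is a hypothetical choice
print(recurrent_params(n_leads, n_hidden, LSTM_BLOCKS))  # 20480
print(recurrent_params(n_leads, n_hidden, GRU_BLOCKS))   # 15360
print(recurrent_params(n_leads, n_hidden, LSTM_BLOCKS, bidirectional=True))  # 40960
print(recurrent_params(n_leads, n_hidden, GRU_BLOCKS, bidirectional=True))   # 30720
```

Under these assumptions a GRU layer has three quarters of the weights of an LSTM layer of the same size, and a bidirectional layer doubles the count, which is in line with the longer bidirectional training times reported above.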

The proposed method's performance is compared with the literature of recent years, as
summarized in Table 4. Tripathy et al. [36] proposed a least-squares SVM with accuracies of 89.93%, 93.95%,
93.03%, 90.09%, and 85.29% for HC, MI, C, D, and hypertrophy, respectively; the reported average accuracy
over all classes is 90.34%. Acharya et al. [37] identified three classes (normal, MI, and other CDs) based on
the discrete cosine transform (DCT), which outperformed the discrete wavelet transform (DWT) and empirical
mode decomposition (EMD). The accuracy, sensitivity, and specificity were 98.50%, 99.72%, and 98.46%,
respectively. In the same year, Acharya et al. [38] explored four classes of CD conditions and proposed
Contourlet transformations. Lui et al. [15] proposed combining CNNs and LSTM, in which the recurrent layer
increased the classification sensitivity by 28% relative to the CNNs alone. In the following year,
Strodthoff et al. [22] proposed a CNN-only architecture for classifying anterior myocardial infarction (aMI),
inferior myocardial infarction (iMI), and healthy control. Using 10-fold cross-validation, the CNNs reached
93.3% and 89.7% for sensitivity and specificity, respectively. Among these benchmark studies on the same
dataset, our proposed unidirectional GRU architecture achieves 98.50% accuracy. The unidirectional GRU
obtained an average ACC, SEN, SPE, PRE, and F1-score of 98.50%, 95.54%, 98.42%, 89.93%, and 92.31%,
respectively, for 15-lead ECG. In contrast, the unidirectional RNN reached its low point in PRE and
F1-score, with 85.00% and 87.27%, respectively.

Table 4. Benchmark results of CDs classification in previous literature
                                                                                                         Performance (%)
Authors                 Method                                  Lead                                     ACC     SEN     SPE     PRE     F1-Score
Tripathy et al. [36]    Support Vector Machine                  12-leads                                 90.34   -       -       -       -
Acharya et al. [37]     Discrete Cosine Transform               1-lead (Lead II)                         98.50   99.72   98.46   -       -
Acharya et al. [38]     Contourlet transformations              1-lead (Lead II)                         99.55   99.93   99.24   -       -
Lui et al. [15]         Convolutional-Recurrent Neural Network  12-leads                                 -       92.4    97.7    -       94.6
Strodthoff et al. [22]  Convolutional Neural Networks           8-leads (I, II, V1, V2, V3, V4, V5, V6)  -       93.3    89.7    -       -
Proposed method         Unidirectional GRU                      15-Leads                                 98.50   95.54   98.42   89.93   92.31





4. CONCLUSION 

The ECG classification process plays a major role in the clinical diagnosis of CDs. The many
ECG signal classes may differ in signal slope, timing, and amplitude, which change the ECG
waveforms. ECG multiclass classification is not as simple as the binary case. Several conventional
machine learning techniques have been proposed in the literature for specific multiclass cases. However,
the main drawback of machine learning remains feature definition and representation. This paper
designs a supervised deep learning model and compares unidirectional and bidirectional recurrent
network algorithms for the multiclass case to identify the best-performing model for CD classification with
15-lead ECG. Despite a large increase in the number of parameters, unidirectional networks can still access
future inputs by windowing, looking ahead, or delaying the output. Yet, in some cases, bidirectional networks
outperform unidirectional ones and are much faster and more accurate than standard recurrent networks and
time-windowed multilayer perceptrons (MLPs). Overall, both unidirectional and bidirectional GRU are better
than LSTM in the multiclass case. Moreover, unidirectional GRU is the best-performing model and achieves
an average accuracy, sensitivity, specificity, precision, and F1-score of 98.50%, 95.54%, 98.42%, 89.93%, and
92.31%, respectively.